A Digital Twin (DT) is a simulation of a physical system that provides information to support decisions that add economic, social, or commercial value. Because the behaviour of a physical system changes over time, a DT must be continually updated with data from the physical system to reflect that changing behaviour. For resource-constrained systems, updating a DT is non-trivial because of challenges such as on-board learning and off-board data transfer. This paper presents a framework for updating data-driven DTs of resource-constrained systems, geared towards system health monitoring. The proposed solution consists of: (1) an on-board system running a lightweight DT that allows the prioritisation and parsimonious transfer of the data generated by the physical system; and (2) off-board robust updating of the DT and detection of anomalous behaviours. Two case studies using a production gas turbine engine system demonstrate the accuracy of the digital representation for real-world, time-varying physical systems.
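As a rough illustration of the on-board prioritisation idea (a sketch under assumptions, not the paper's implementation), the snippet below scores incoming samples by how poorly a lightweight surrogate model explains them and forwards only the worst-explained ones within a transfer budget; the linear surrogate and the `transfer_budget` parameter are hypothetical.

```python
# Hypothetical sketch: on-board prioritisation of data by a lightweight
# data-driven digital twin. Names (LightweightDT, transfer_budget) are illustrative.
import numpy as np

class LightweightDT:
    """Minimal surrogate model kept on-board the resource-constrained system."""
    def __init__(self, weights):
        self.weights = weights          # e.g. distilled from the off-board DT

    def predict(self, x):
        return x @ self.weights         # cheap linear surrogate, for illustration only

def prioritise(dt, samples, observations, transfer_budget):
    """Rank samples by how poorly the on-board DT explains them and return
    only the most informative ones for parsimonious off-board transfer."""
    errors = np.abs(dt.predict(samples) - observations).sum(axis=1)
    order = np.argsort(errors)[::-1]    # largest residual = highest priority
    keep = order[:transfer_budget]
    return samples[keep], observations[keep]
```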
Gas turbine engines are complex machines that typically generate vast amounts of data and need careful monitoring to allow for cost-effective preventative maintenance. In aerospace applications, transferring all measured data to the ground is expensive, often causing useful, high-value data to be discarded. The ability to detect, prioritise, and return useful data in real time is therefore key. This paper proposes that system output measurements, described by a convolutional neural network model of normality, are prioritised in real time for preventative maintenance decision makers. Due to the complexity of the time-varying behaviour of gas turbine engines, deriving accurate physical models is difficult and often yields models with low prediction accuracy that are incompatible with real-time execution. Data-driven modelling is a desirable alternative, producing high-accuracy, asset-specific models without the need for derivation from first principles. We present a data-driven system for the online detection and prioritisation of anomalous data. Through uncertainty management integrated into the deep neural predictive model, the erroneous assessment of data arising from novel operating conditions is avoided. Tests were conducted on real and synthetic data, showing sensitivity to both real and synthetic faults. The system is capable of running in real time on low-power embedded hardware and is currently being deployed for Rolls-Royce Pearl 15 engine flight trials.
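A minimal sketch of how a normality model combined with uncertainty management might drive prioritisation, assuming a predictive model that returns both a prediction and an uncertainty estimate per measurement window; the interface and scoring rule here are assumptions, not the paper's implementation.

```python
# Hedged sketch: prioritise measurements whose residual from the normality model
# is large relative to the model's own predictive uncertainty, so data from
# novel operating conditions (high uncertainty) is not mistaken for a fault.
import numpy as np

def anomaly_priority(residual, pred_std, eps=1e-6):
    """Residual normalised by predictive uncertainty."""
    return np.abs(residual) / (pred_std + eps)

def select_for_downlink(measurements, predictions, pred_std, k):
    """Return the k measurement windows least explained by the normality model."""
    scores = anomaly_priority(measurements - predictions, pred_std)
    top = np.argsort(scores.max(axis=1))[::-1][:k]   # worst-explained windows first
    return measurements[top], scores[top]
```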
Rankings are widely collected in various real-life scenarios, leading to the leakage of personal information such as users' preferences for videos or news. To protect rankings, existing works mainly develop privacy protection for a single ranking within a set of rankings, or for pairwise comparisons of a ranking, under $\epsilon$-differential privacy. This paper proposes a novel notion called $\epsilon$-ranking differential privacy for protecting rankings. We establish the connection between the Mallows model (Mallows, 1957) and the proposed $\epsilon$-ranking differential privacy. This allows us to develop a multistage ranking algorithm that generates synthetic rankings while satisfying the developed $\epsilon$-ranking differential privacy. Theoretical results regarding the utility of synthetic rankings in downstream tasks, including the inference attack and personalized ranking tasks, are established. For the inference attack, we quantify how $\epsilon$ affects the estimation of the true ranking from synthetic rankings. For the personalized ranking task, we consider varying privacy preferences among users and quantify how these preferences affect the consistency of estimating the optimal ranking function. Extensive numerical experiments verify the theoretical results and demonstrate the effectiveness of the proposed synthetic ranking algorithm.
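For intuition, the sketch below samples a synthetic ranking from a Mallows model centred at the true ranking using the classic multistage (repeated-insertion) construction; the dispersion parameter `theta` stands in for the privacy parameter up to whatever calibration the paper derives, so this is illustrative rather than the authors' exact algorithm.

```python
# Illustrative multistage sampling from a Mallows model: each item is inserted
# near its true position with exponentially decaying probability of displacement.
import numpy as np

def mallows_synthetic_ranking(true_ranking, theta, rng=None):
    rng = rng or np.random.default_rng()
    synthetic = []
    for i, item in enumerate(true_ranking):
        # insertion position j has weight exp(-theta * (i - j)); positions close
        # to the item's true rank are exponentially more likely
        weights = np.exp(-theta * (i - np.arange(i + 1)))
        j = rng.choice(i + 1, p=weights / weights.sum())
        synthetic.insert(j, item)
    return synthetic

# Example: release a privatised version of a preference ranking over five items
print(mallows_synthetic_ranking(["a", "b", "c", "d", "e"], theta=0.7))
```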
Contextual bandits have been widely used for sequential decision-making based on current contextual information and historical feedback data. In modern applications, the context can be rich and can often be formulated as a matrix. Moreover, while existing bandit algorithms have mainly focused on reward maximization, less attention has been paid to statistical inference. To fill these gaps, in this work we consider a matrix contextual bandit framework where the true model parameter is a low-rank matrix, and propose a fully online procedure that simultaneously makes sequential decisions and conducts statistical inference. The low-rank structure of the model parameter and the adaptive nature of the data-collection process make this difficult: standard low-rank estimators are not fully online and are biased, while existing inference approaches for bandit algorithms fail to account for the low-rankness and are also biased. To address this, we introduce a new online doubly-debiased inference procedure that handles both sources of bias simultaneously. In theory, we establish the asymptotic normality of the proposed online doubly-debiased estimator and prove the validity of the constructed confidence interval. Our inference results are built upon a newly developed low-rank stochastic gradient descent estimator and its non-asymptotic convergence result, which is also of independent interest.
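To make the low-rank online estimation concrete, here is a hedged sketch of the kind of stochastic gradient step such a framework might build on, with the parameter factorised as $\Theta = UV^\top$; the factorisation, step size, and squared-error objective are assumptions, not the authors' exact estimator.

```python
# Hedged sketch: one online low-rank SGD update after observing a context matrix
# X and the realised reward, under the reward model r = <X, Theta> + noise.
import numpy as np

def low_rank_sgd_step(U, V, X, reward, lr=0.01):
    """Update the factors of the low-rank estimate Theta = U @ V.T."""
    pred = np.sum((U @ V.T) * X)          # <X, Theta>
    resid = pred - reward
    grad_theta = resid * X                # gradient of 0.5 * resid**2 w.r.t. Theta
    U_new = U - lr * grad_theta @ V       # chain rule through Theta = U V^T
    V_new = V - lr * grad_theta.T @ U
    return U_new, V_new
```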
The reward hypothesis posits that, "all of what we mean by goals and purposes can be well thought of as maximization of the expected value of the cumulative sum of a received scalar signal (reward)." We aim to fully settle this hypothesis. This will not conclude with a simple affirmation or refutation, but rather specify completely the implicit requirements on goals and purposes under which the hypothesis holds.
The coverage given to different stakeholders mentioned in news articles significantly impacts the slant, or polarity, of the concerned news publishers. For instance, pro-government media outlets give more coverage to government stakeholders to increase their accessibility to news audiences, whereas anti-government news agencies focus more on the views of opposition stakeholders to inform readers about the shortcomings of government policies. In this paper, we address the problem of stakeholder extraction from news articles and thereby determine the inherent bias present in news reporting. Identifying potential stakeholders in multi-topic news scenarios is challenging because each news topic has different stakeholders. The research presented in this paper utilizes both contextual information and external knowledge to identify topic-specific stakeholders from news articles. We also apply a sequential incremental clustering algorithm to group entities with similar stakeholder types. We carried out all our experiments on news articles about four Indian government policies published by numerous national and international news agencies. We further generalize our system, and the experimental results show that the proposed model can be extended to other news topics.
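A minimal sketch of a sequential incremental clustering pass of the sort described for grouping entities with similar stakeholder types; the cosine-similarity criterion, threshold, and running-centroid update are assumptions rather than the paper's algorithm.

```python
# Illustrative incremental clustering: assign each entity embedding to the
# nearest existing cluster if similar enough, otherwise open a new cluster.
import numpy as np

def incremental_cluster(entity_vectors, threshold=0.8):
    centroids, assignments = [], []
    for v in entity_vectors:
        if centroids:
            sims = [v @ c / (np.linalg.norm(v) * np.linalg.norm(c)) for c in centroids]
            best = int(np.argmax(sims))
            if sims[best] >= threshold:
                assignments.append(best)
                centroids[best] = (centroids[best] + v) / 2   # crude running centroid
                continue
        centroids.append(np.asarray(v, dtype=float))
        assignments.append(len(centroids) - 1)
    return assignments
```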
Climate change, population growth, and water scarcity present unprecedented challenges for agriculture. This project aims to forecast soil moisture using domain knowledge and machine learning for crop management decisions that enable sustainable farming. Traditional methods for predicting hydrological response features require significant computational time and expertise. Recent work has implemented machine learning models as a tool for forecasting hydrological response features, but these models neglect a crucial component of traditional hydrological modeling: spatially close units can have vastly different hydrological responses. In traditional hydrological modeling, units with similar hydrological properties are grouped together and share model parameters regardless of their spatial proximity. Inspired by this domain knowledge, we have constructed a novel domain-inspired temporal graph convolution neural network. Our approach involves clustering units based on time-varying hydrological properties, constructing graph topologies for each cluster, and forecasting soil moisture using graph convolutions and a gated recurrent neural network. We have trained, validated, and tested our method on field-scale time series data consisting of approximately 99,000 hydrological response units spanning 40 years in a case study in the northeastern United States. Comparison with existing models illustrates the effectiveness of using domain-inspired clustering with time series graph neural networks. The framework is being deployed as part of a pro bono social impact program, and the trained models are being deployed on small-holding farms in central Texas.
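The following sketch illustrates the two domain-inspired steps in isolation, clustering units by hydrological properties and then propagating features over a within-cluster graph; the k-means choice, the degree normalisation, and the single-layer propagation are assumptions, and the gated recurrent component is omitted for brevity.

```python
# Hedged sketch of the domain-inspired pipeline: cluster hydrological response
# units by their (time-varying) properties, then apply one graph convolution
# step within a cluster's graph.
import numpy as np
from sklearn.cluster import KMeans

def cluster_units(hydro_features, n_clusters=8):
    """Group units by hydrological properties rather than spatial proximity."""
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(hydro_features)

def graph_convolution(adjacency, node_features, weight):
    """One propagation step H' = D^{-1} A H W over a cluster's graph."""
    deg = adjacency.sum(axis=1, keepdims=True) + 1e-8
    return (adjacency / deg) @ node_features @ weight
```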
The primary aim of this research was to find a model that best predicts which fallen angel bonds would potentially rise back to investment grade and which would fall into bankruptcy. To implement the solution, we judged that the ideal approach would be to build an optimal machine learning model for predicting bankruptcies. Among the many machine learning models available, we chose four classification methods: logistic regression, KNN, SVM, and NN. We also utilized Google Cloud's automated machine learning. The model comparisons showed that the models did not predict bankruptcies very well on the original data set, with the exception of Google Cloud's machine learning, which achieved a high precision score. However, the models did perform very well on our over-sampled, feature-selected data set. This is likely because the models were over-fitted to the narrative of the over-sampled data (that is, they would not predict well on data outside of this data set). We were therefore unable to create a model that we are confident would predict bankruptcies. However, we drew value from this project in two key ways. First, Google Cloud's machine learning model either outperformed or performed on par with the other models in every metric and on every data set. Second, we found that feature selection did not reduce predictive power by much, which means we can reduce the amount of data collected in future experiments on predicting bankruptcies.
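A hedged comparison sketch of the four named classifiers on a feature-selected split; the feature-selection method, train/test split, and precision metric are assumptions (the over-sampling step and the Google Cloud AutoML baseline are omitted), so this is a sketch rather than the authors' pipeline.

```python
# Illustrative comparison of logistic regression, KNN, SVM, and NN on a
# feature-selected bankruptcy data set (binary labels assumed: 1 = bankrupt).
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score

def compare_models(X, y, k_features=10):
    X_sel = SelectKBest(f_classif, k=k_features).fit_transform(X, y)
    X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, stratify=y, random_state=0)
    models = {
        "logistic": LogisticRegression(max_iter=1000),
        "knn": KNeighborsClassifier(),
        "svm": SVC(),
        "nn": MLPClassifier(max_iter=1000),
    }
    return {name: precision_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
            for name, m in models.items()}
```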
We study the learning dynamics of self-predictive learning for reinforcement learning, a family of algorithms that learn representations by minimizing the prediction error of their own future latent representations. Despite its recent empirical success, such algorithms have an apparent defect: trivial representations (such as constants) minimize the prediction error, yet it is obviously undesirable to converge to such solutions. Our central insight is that careful design of the optimization dynamics is critical to learning meaningful representations. We identify that a faster-paced optimization of the predictor and semi-gradient updates on the representation are crucial to preventing representation collapse. Then, in an idealized setup, we show that self-predictive learning dynamics carry out a spectral decomposition of the state transition matrix, effectively capturing information about the transition dynamics. Building on these theoretical insights, we propose bidirectional self-predictive learning, a novel self-predictive algorithm that learns two representations simultaneously. We examine the robustness of our theoretical insights with a number of small-scale experiments and showcase the promise of the novel representation learning algorithm with large-scale experiments.
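The two design choices the analysis highlights, a faster-paced predictor and semi-gradient (stop-gradient) updates on the representation, can be sketched as below; the architectures, optimizers, and learning rates are placeholders, not the paper's setup.

```python
# Hedged sketch: latent self-prediction with a stop-gradient on the bootstrapped
# target and a predictor optimized at a faster pace than the encoder.
import torch
import torch.nn as nn

encoder = nn.Linear(32, 16)            # phi(s): state -> latent representation
predictor = nn.Linear(16, 16)          # P: predicts the next latent

opt_enc = torch.optim.SGD(encoder.parameters(), lr=1e-3)
opt_pred = torch.optim.SGD(predictor.parameters(), lr=1e-2)   # faster-paced predictor

def self_predictive_step(s, s_next):
    z, z_next = encoder(s), encoder(s_next)
    loss = ((predictor(z) - z_next.detach()) ** 2).mean()      # semi-gradient target
    opt_enc.zero_grad(); opt_pred.zero_grad()
    loss.backward()
    opt_enc.step(); opt_pred.step()
    return loss.item()
```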
Although self-/un-supervised methods have led to rapid progress in visual representation learning, these methods generally treat objects and scenes through the same lens. In this paper, we focus on learning representations for objects and scenes that preserve the structure among them. Motivated by the observation that visually similar objects are close in the representation space, we argue that the scenes and objects should instead follow a hierarchical structure based on their compositionality. To exploit such a structure, we propose a contrastive learning framework where a Euclidean loss is used to learn object representations and a hyperbolic loss is used to encourage representations of scenes to lie close to representations of their constituent objects in a hyperbolic space. This novel hyperbolic objective encourages the scene-object hypernymy among the representations by optimizing the magnitude of their norms. We show that when pretraining on the COCO and OpenImages datasets, the hyperbolic loss improves downstream performance of several baselines across multiple datasets and tasks, including image classification, object detection, and semantic segmentation. We also show that the properties of the learned representations allow us to solve various vision tasks that involve the interaction between scenes and objects in a zero-shot fashion. Our code can be found at \url{https://github.com/shlokk/HCL/tree/main/HCL}.
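A minimal sketch of the hyperbolic part of such an objective, assuming embeddings have already been projected into the Poincaré ball and paired scene/object embeddings are available; the distance formula is the standard Poincaré-ball metric, and the pairing and clipping constants are assumptions rather than the repository's implementation.

```python
# Hedged sketch: pull scene embeddings toward their constituent objects under
# the Poincare-ball distance, which lets composite scenes take larger norms.
import torch

def poincare_distance(u, v, eps=1e-5):
    """d(u, v) = arccosh(1 + 2||u-v||^2 / ((1-||u||^2)(1-||v||^2))),
    assuming u, v lie strictly inside the unit ball."""
    sq = ((u - v) ** 2).sum(-1)
    den = (1 - (u ** 2).sum(-1)).clamp(min=eps) * (1 - (v ** 2).sum(-1)).clamp(min=eps)
    return torch.acosh(1 + 2 * sq / den)

def scene_object_loss(scene_emb, object_emb):
    """Contractive term: each scene should lie near its constituent objects."""
    return poincare_distance(scene_emb, object_emb).mean()
```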